Section 9 Appendix 2: Data Pre-processing

Data pre-processing can be defined as the work required to go from raw data into data usable for modelling.

The raw data used in this analysis are remote sensing and other geographic information system data, herafter RS data, such as digitized maps of elevation, population density or public services locations. These data need to be aggregated and linked with the survey containing the data on the development indicators we aim to map.

Data pre-processing is arguably one of the most time-consuming tasks of any data science project. This section documents the path required to convert the raw data into data usable for high resolution mapping. Here are the main steps:

  1. Load the RS layers;
  2. Create summaries for RS layers which come with a time dimension (e.g. average precipitation over the last thirthy years);
  3. Perform additional calculation as required (e.g. compute the Soil degradation index based on soil degradation sub-indicator);
  4. Compute distance metrics to public services, businesses and some natural features;
  5. Align the Coordinates reference systems across layers (geographic or projected);
  6. Aggregate the RS layers to the map of interest (e.g. the segmento map) with a chosen set of spatial statistics (average precipitation per segmento);
  7. Match the RS aggregates with the outcome data (poverty, income, literay);
  8. Write the data frame to a .csv file for later use.

Readers only interested by the modelling part can skip this section entirely.

9.1 Data sources

The 2017 household survey Encuesta de Hogares de Propositos Multiples (EHPM), collected by the national statistical office of El Salvador (DIGESTYC), is used to compute the three main development indicators below:

  • Median income,
  • Poverty,
  • Literacy.

The EHPM survey data are stored in the file ehpm-2017.csv. It is publicly available on the DIGESTYC website here. DIGESTYC provided the research team with the segmento identifiers, allowing for linking the RS data with the EPHM data at the segmento level rather than the canton level as it would have been the case if only using the publicly available data. The segmento identifier is stored in the file Identificador de segmento.xlsx.

A suite of RS data are used as potential predictors of the three indicators above. Data sources are listed in Table ?? (see web-book version of this tutorial for the links).

Names Type Source
Population count raster CIESIN
Precipitation raster Climate Hazards Center
Lights at night raster NOAA
Intergrated Food Security Phase Classsification vector FEWS Net
Livelihood Zones vector FEWS Net
Altitude raster NASA Shuttle Radar Topography Mission (SRTMv003)
Soil degradation raster Trend.Earth
Buildings vector OpenStreetMap
Points of interest points vector OpenStreetMap
Roads network vector OpenStreetMap
Vegetation Greenness (NDVI) raster U.S. Geological Survey (USGS)
Schools vector La Direccion General de Estadistica y Censos (DIGESTYC)
Health centres csv La Direccion General de Estadistica y Censos (DIGESTYC)
Hospitals csv La Direccion General de Estadistica y Censos (DIGESTYC)
Temperature raster WorldClim Version2 (Fick and Hijmans 2017)
Slope raster WorldPop Archive global gridded spatial datasets
Distance to OSM major roads raster WorldPop and CIESIN
Distance to OSM major roads intersections raster WorldPop and CIESIN
Distance to OSM major waterway raster WorldPop and CIESIN
Built settlement raster WorldPop and CIESIN
Land cover raster European Space Agency

9.2 Loading the survey

We start by loading the segmento shapefile and the EHPM survey data:

  • ehpm: this is the EHPM survey data
  • segmento_hh_id: this is the segmento id matched to the household id, it allows us to map households at the segmento level
  • segmento shapefile: this is the map of all segmentos

In order to get a more easily readable directory path, let us create an object with the path to the folder where the data are stored:

# modify dir_data to where you stored the data
root_dir="~/"
project_dir="data/"

dir_data=paste0(root_dir,project_dir)
ehpm=read.csv(paste0(dir_data,
                     "tables/ehpm-2017.csv")) 
segmento_hh_id=readxl::read_xlsx(paste0(dir_data,
                                        "tables/Identificador de segmento.xlsx"))

9.3 Loading the vectors

Vector data are polygons (e.g. adminstrative areas), lines (e.g. roads network) or points (e.g. coordinates of schools). They can be stored in various formats, the most common being ESRI shapefiles and geojson files.

9.3.1 Loading the administrative boundaries

Let us now load the administrative maps of the segmentos and departementos withg the command readOGR from the rgdal package. Shapefiles are loaded as a SpatialPolygonDataframe, i.e. a geo-referenced set of polygons to which a data frame is attached.

dir_shape="spatial/shape/"
segmento_sh=readOGR(paste0(dir_data,
                           dir_shape,
                           "admin/STPLAN_Segmentos.shp"))
## OGR data source with driver: ESRI Shapefile 
## Source: "/home/rstudio/data/spatial/shape/admin/STPLAN_Segmentos.shp", layer: "STPLAN_Segmentos"
## with 12435 features
## It has 17 fields
departamentos_sh=readOGR(paste0(dir_data,
                                dir_shape,
                                       "admin/STPLAN_Departamentos.shp"))
## OGR data source with driver: ESRI Shapefile 
## Source: "/home/rstudio/data/spatial/shape/admin/STPLAN_Departamentos.shp", layer: "STPLAN_Departamentos"
## with 14 features
## It has 6 fields

The segmento_sh shape file loaded above consists for instance of a data frame with 17 fields (i.e. variables).

The names of each field can be accesed by running the command below.

names(segmento_sh)
##  [1] "OBJECTID"   "DEPTO"      "COD_DEP"    "MPIO"       "COD_MUN"   
##  [6] "CANTON"     "COD_CAN"    "COD_ZON_CE" "COD_SEC_CE" "COD_SEG_CE"
## [11] "AREA_ID"    "SEG_ID"     "AREA_KM2"   "SHAPE_Leng" "Shape_Le_1"
## [16] "Shape_Area" "demmean"

The data frame attached to the polygon map can be inspected using the following functions.

head(segmento_sh@data) # provide the 5 first rows of the dataframe segmento_sh@data
##   OBJECTID      DEPTO COD_DEP       MPIO COD_MUN      CANTON COD_CAN
## 0        1 AHUACHAPAN      01 ATIQUIZAYA      03 AREA URBANA      00
## 1        2 AHUACHAPAN      01  GUAYMANGO      06  EL ESCALON      04
## 2        3 AHUACHAPAN      01 AHUACHAPAN      01 AREA URBANA      00
## 3        4 AHUACHAPAN      01 AHUACHAPAN      01 AREA URBANA      00
## 4        5 AHUACHAPAN      01 AHUACHAPAN      01 AREA URBANA      00
## 5        6 AHUACHAPAN      01 AHUACHAPAN      01 AREA URBANA      00
##   COD_ZON_CE COD_SEC_CE COD_SEG_CE AREA_ID   SEG_ID AREA_KM2 SHAPE_Leng
## 0         00        002       1015       U 01031015  0.04352  1261.0632
## 1         00        005       0016       R 01060016  1.65778  8536.1704
## 2         01        003       1015       U 01011015  0.07500  1165.9750
## 3         01        003       1014       U 01011014  0.05010   988.8543
## 4         02        009       1019       U 01011019  0.06910  1135.7582
## 5         02        009       1042       U 01011042  0.04690   857.8403
##   Shape_Le_1 Shape_Area demmean
## 0  1261.0217   43526.87      NA
## 1  8535.8890 1657675.21      NA
## 2  1165.9366   74995.98      NA
## 3   988.8216   50104.03      NA
## 4  1135.7209   69104.02      NA
## 5   857.8122   46899.53      NA
summary(segmento_sh@data) # summarise each field of the dataframe segmento_sh@data
##     OBJECTID              DEPTO         COD_DEP               MPIO     
##  Min.   :    1   SAN SALVADOR:3156   06     :3156   SAN SALVADOR: 662  
##  1st Qu.: 3110   LA LIBERTAD :1419   05     :1419   SAN MIGUEL  : 494  
##  Median : 6219   SANTA ANA   :1124   02     :1124   SANTA ANA   : 492  
##  Mean   : 6219   SAN MIGUEL  : 962   12     : 962   SOYAPANGO   : 476  
##  3rd Qu.: 9328   SONSONATE   : 901   03     : 901   MEJICANOS   : 274  
##  Max.   :12437   USULUTAN    : 778   11     : 778   APOPA       : 260  
##                  (Other)     :4095   (Other):4095   (Other)     :9777  
##     COD_MUN                  CANTON        COD_CAN       COD_ZON_CE  
##  17     :1085   AREA URBANA     :4945   00     :4945   00     :4325  
##  10     : 922   SAN BARTOLO     : 157   02     : 914   01     :2763  
##  14     : 864   VERACRUZ        : 145   01     : 908   02     :1747  
##  02     : 836   LAS FLORES      : 100   04     : 871   03     :1021  
##  08     : 809   SAN LUIS MARIONA:  90   03     : 813   04     : 763  
##  03     : 751   LAS DELICIAS    :  84   06     : 673   05     : 504  
##  (Other):7168   (Other)         :6914   (Other):3311   (Other):1312  
##    COD_SEC_CE     COD_SEG_CE    AREA_ID       SEG_ID     
##  002    :1066   1001   :  262   R:5088   01010001:    1  
##  003    : 996   0001   :  259   U:7347   01010002:    1  
##  001    : 877   0002   :  259            01010003:    1  
##  004    : 788   0003   :  255            01010004:    1  
##  005    : 696   0004   :  252            01010005:    1  
##  006    : 568   0005   :  247            01010006:    1  
##  (Other):7444   (Other):10901            (Other) :12429  
##     AREA_KM2          SHAPE_Leng        Shape_Le_1        Shape_Area      
##  Min.   : 0.00000   Min.   :  281.4   Min.   :  281.4   Min.   :    3903  
##  1st Qu.: 0.06062   1st Qu.: 1256.3   1st Qu.: 1256.3   1st Qu.:   60567  
##  Median : 0.27345   Median : 2859.6   Median : 2859.5   Median :  273177  
##  Mean   : 1.67100   Mean   : 5494.0   Mean   : 5493.8   Mean   : 1672961  
##  3rd Qu.: 2.08214   3rd Qu.: 8492.5   3rd Qu.: 8492.3   3rd Qu.: 2082009  
##  Max.   :48.36518   Max.   :54684.4   Max.   :54682.6   Max.   :48194678  
##                                                                           
##     demmean     
##  Min.   : NA    
##  1st Qu.: NA    
##  Median : NA    
##  Mean   :NaN    
##  3rd Qu.: NA    
##  Max.   : NA    
##  NA's   :12435

We see that Shape_Le_1 and SHAPE_Leng are duplicates, while demmean only comprises of null values. Let us now remove these fields using the dplyr function select.

segmento_sh@data=segmento_sh@data%>% 
  select(-c(Shape_Le_1,demmean))

In order to get a country mask, i.e. the boundary of the country, we dissolve the multiples polygons of the departamentos shapefile in one overall map of the country with the function maptools::unionSpatialPolygons.

SLV_adm0_proj=maptools::unionSpatialPolygons(departamentos_sh,
                                             rep(1, nrow(departamentos_sh)))

The country mask is shown on figure 9.1.

plot(SLV_adm0_proj)
El Salvador country mask

Figure 9.1: El Salvador country mask

9.3.1.1 Adjusting the coordinates system

There are two families of coordinates systems:

  • Geographic Coordinates Systems (GCS) : “A system that uses a three-dimensional spherical surface to define locations on the earth”5.
  • Projected coordinates systems (PCS) : “A system defined on a flat, two-dimensional surface. Unlike a geographic coordinate system, a projected coordinate system has constant lengths, angles, and areas across the two dimensions”6

The shapefile for the segmentos, departmentos and the country mask have a projected coordinate system.

SLV_adm0_proj@proj4string
## CRS arguments:
##  +proj=lcc +lat_1=13.316666 +lat_2=14.25 +lat_0=13.783333
## +lon_0=-89 +x_0=500000 +y_0=295809.184 +datum=NAD27 +units=m
## +no_defs +ellps=clrk66
## +nadgrids=@conus,@alaska,@ntv2_0.gsb,@ntv1_can.dat

The coordinates are expressed in meters:

sp::bbox(SLV_adm0_proj)
##        min      max
## x 377579.6 642687.1
## y 226506.1 369595.5

Since most spatial RS data are based on a geographic coordinates system, we create a second version of the country mask with a geographic coordinates system. This is done with the function spTransform from the sp library. The use of :: allows to call the function without laoding the the sp library in then environment.

proj_WGS_84="+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
segmento_sh_wgs=sp::spTransform(segmento_sh,
                         proj_WGS_84)
SLV_adm0=sp::spTransform(SLV_adm0_proj,
                         proj_WGS_84)
sp::bbox(SLV_adm0)
##         min       max
## x -90.13196 -87.68389
## y  13.15441  14.44997

The coordinates are now expressed in decimal degrees.

9.3.2 Loading the shapefiles containing the fields used as covariates

We now load the shapefiles containing the fields used as covariates, i.e. containing variables that will be used as predictors in the model we will fit in the next sections.

We start with the Livelihood zone maps accessed from the Famine Early Warning Systems Network (FEWS NET).

LZ_sh=readOGR(paste0(dir_data,
                     dir_shape,
                     "LZ/SV_LHZ_2010.shp"))

It is mapped on ?? for inspection.

leaflet(LZ_sh)%>%
  addTiles()%>%
  addPolygons(weight=1,
              color = "#444444",
              smoothFactor = 1,
              fillOpacity = 1,
              fillColor = ~colorFactor("Accent", LZNAMEEN)(LZNAMEEN),
              popup = ~LZNAMEEN)%>%
  addLegend("bottomright",
            colors = ~colorFactor("Accent", LZNAMEEN)(LZNAMEEN), 
            labels = ~LZNAMEEN,
            opacity = 1)

Figure 9.2: Livelihood zones

Let us now load all the other vectors data.

school_sh=readOGR(paste0(dir_data,
                         dir_shape,
                         "schools/MINED.shp"),use_iconv=TRUE, encoding = "UTF-8")
buildings_poly_sh=readOGR(paste0(dir_data,
                                 dir_shape,
                                 "hotosm/hotosm_slv_buildings_polygons.shp"), use_iconv=TRUE)
poi_points_sh=readOGR(paste0(dir_data,
                             dir_shape,
                             "hotosm/hotosm_slv_points_of_interest_points.shp"))
poi_poly_sh=readOGR(paste0(dir_data,
                           dir_shape,
                           "hotosm/hotosm_slv_points_of_interest_polygons.shp"))
roads_lines_sh=readOGR(paste0(dir_data,
                              dir_shape,
                              "hotosm/hotosm_slv_roads_lines.shp"))
roads_poly_sh=readOGR(paste0(dir_data,
                             dir_shape,
                             "hotosm/hotosm_slv_roads_polygons.shp"))
coastline=readOGR(paste0(dir_data,
                         dir_shape,
                         "coastline/coastline.shp"))

waterbodies=readOGR(paste0(dir_data,
                           dir_shape,
                           "hotosm/hotosm_slv_waterways_polygons.shp"))

proj_lcc=" +proj=lcc +lat_1=13.316666 +lat_2=14.25 +lat_0=13.783333 +lon_0=-89 +x_0=500000 +y_0=295809.184
+datum=NAD27 +units=m +no_defs +ellps=clrk66 +nadgrids=@conus,@alaska,@ntv2_0.gsb,@ntv1_can.dat"
waterbodies_proj=sp::spTransform(waterbodies,
                                 proj_lcc)

rivers=readOGR(paste0(dir_data,
                      dir_shape,
                      "hotosm/hotosm_slv_waterways_lines.shp"))
rivers_proj=sp::spTransform(rivers,
                            proj_lcc)

9.3.3 Loading point-referenced data from tables

Two datasets are provided as R data frames in a .Rda file. They are loaded with the function load:

load(paste0(dir_data,
            dir_shape,
            "hospitales/hospitales.Rda"))
load(paste0(dir_data,
            dir_shape,
            "UCSF/UCSF.Rda"))

The package sp is then used to to turn these data frames into SpatialPointsDataframe objects:

# in the .Rda,there is an error in the naming of the coordinates: latitudes are actually longitudes and vice versa
hospitales=hospitales%>% 
  rename(lon=latitude,
         lat=longitude)
hospitals_sp=as.data.frame(hospitales)

# Features "lon" and "lat" are declared the the spatial coordinates of the data frame "hospitals_sp"
# The data frame "hospitals_sp" becomes am object of the type SpatialPointsDataframe 
sp::coordinates(hospitals_sp)=~lon+lat

# A spatial coordinate system is assigned to the SpatialPointsDataframe "hospitals_sp" 
sp::proj4string(hospitals_sp)=sp::CRS(proj_WGS_84)

The same is done with the health center (not shown).

The resulting map is plotted on ?? for reference7.

leaflet(SLV_adm0)%>%
  addPolygons()%>%
  addTiles()%>%
  addMarkers(lng=coordinates(hospitals_sp)[,1],
             lat=coordinates(hospitals_sp)[,2],
             popup=hospitals_sp$name)%>%
  addCircleMarkers(lng=coordinates(health_centres_sp)[,1],
                   lat=coordinates(health_centres_sp)[,2],
                   radius = 4,
                   popup=paste0(health_centres_sp$name,"\n(",health_centres_sp$tipo,")"),
                   stroke = F,
                   fillOpacity = 1,
                   fillColor = colorFactor("Accent", health_centres_sp$tipo)(health_centres_sp$tipo))
## Warning in writeBin(raw, conn): problem writing to connection

9.4 Loading the rasters

We now turn to the raster files. Raster files can be described as geo-referenced pixel data. For instance, a raster of population density consists of a map of pixels. The value of each pixel is the people density in this pixel. raster data are provided in two main formats: zipped .tif and .ncdffiles.

9.4.1 Lights at night

Lights at night data products are available both per calendar month and per year. The annual data are used for 1983 to 2016. For 2017, the yearly aggregate were not yet released at the time of data pre-processing. Monthly aggregated were hence used.

The first step is to unzip the files with the untar command. For the sake of convenience, data are already provided as extracted .tif in the data folder data/spatial/raster/l@n/. Hence, the code snippet below is just shown for learning purposes, no need to run it.

#get the list of all files in the l@n directory with the command -dir-, which takes as argument 
# the path to the directory.
list_of_files=dir(paste0(dir_data,
                         "spatial/raster/l@n/"))
#identify the index of the 2017 monthly files
index_of_2017_files=grep("npp_2017",
                         dir(paste0(dir_data,
                                    "spatial/raster/l@n/")))
# select in the list of all file the 2017 monthly files thanks to the index
files_2017=list_of_files[index_of_2017_files]

for(monthly_file in files_2017){
  untar(paste0(dir_data,
               "spatial/raster/l@n/",
               monthly_file), # the path to the file of interest
        compressed="gzip", # specify it was zipped with gzip
        exdir=paste0(dir_data,
                     "spatial/raster/l@n/")) # specify the exdir
}

The 12 monthly lights at night .tif files are loaded with the raster::raster function, cropped to the El Salvador’s boundaries with raster::crop and stacked in one object withraster::stack. Note the use of the for loop function to loop over the 12 monthly files.

# we get the list of the  files in the directory
list_of_files=dir(paste0(dir_data,
                         "spatial/raster/l@n/")) 

# we get the list of lights at night files labelled "avg_rade9h"
monthly_file_light=list_of_files[grepl("avg_rade9h", 
                                       list_of_files, 
                                       fixed=T)]

# create a vector of character to name the raster as file_1, file_2, etc.
names_file=paste0("file_",1:12)

# we loop over the 12 monthly files in order to
for(i in 1:length(monthly_file_light)){
  
  # load each i monthly layer
  r=raster::raster(paste0(dir_data,
                          "spatial/raster/l@n/",
                          monthly_file_light[i])) 
  # crop each i monthly layer
  r=raster::crop(r,
                 SLV_adm0)
  
  # assign each i cropped monthly layer to a name
  assign(names_file[i], # assign it to an object name
         r)
}

# stack all the layers
lights_all=raster::stack(lapply(names_file,FUN=get)) 

Let us have a look at the object lights_all:

class(lights_all)[1]
## [1] "RasterStack"
dim(lights_all)
## [1] 310 587  12
raster::res(lights_all)
## [1] 0.004166667 0.004166667
raster::extent(lights_all)
## class       : Extent 
## xmin        : -90.13125 
## xmax        : -87.68542 
## ymin        : 13.15625 
## ymax        : 14.44792
raster::crs(lights_all)
## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0

This a RasterStack object, i.e. a stack of raster where each layer is aligned one above the other. It has 310 rows, 587 columns, 12 layers (1 per month) for a total of 181,970 cells. The resolution is provided in decimal degrees: 0.004166667 degrees by 0.004166667 degrees pixels.

Let us plot the 12 monthly layers for inspection.

raster::plot(lights_all,
             main=c("jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"))
2017 monthly average lights at night 
(average DNB radiance)

Figure 9.3: 2017 monthly average lights at night (average DNB radiance)

The maps in 9.3 show there are variations over the year.

9.4.2 Precipitation

The CHIRPS precipitation data are provided as a .nc file. A .nc file is a collection of raster files, similar to a raster stack. Data are loaded as a brick thanks to the raster::brick function, similar to the raster stack function8.

chirps=raster::brick(paste0(dir_data,
                            "spatial/raster/chirps/chirps-v2.0.annual.nc"),
                     band=1:38)
print(chirps)
## File /home/rstudio/data/spatial/raster/chirps/chirps-v2.0.annual.nc (NC_FORMAT_NETCDF4):
## 
##      1 variables (excluding dimension variables):
##         float precip[longitude,latitude,time]   (Chunking: [800,223,4])  (Compression: level 5)
##             units: mm/year
##             standard_name: convective precipitation rate
##             long_name: Climate Hazards group InfraRed Precipitation with Stations
##             time_step: year
##             missing_value: -9999
##             _FillValue: -9999
##             geostatial_lat_min: -50
##             geostatial_lat_max: 50
##             geostatial_lon_min: -180
##             geostatial_lon_max: 180
## 
##      3 dimensions:
##         longitude  Size:7200
##             units: degrees_east
##             standard_name: longitude
##             long_name: longitude
##             axis: X
##         latitude  Size:2000
##             units: degrees_north
##             standard_name: latitude
##             long_name: latitude
##             axis: Y
##         time  Size:38
##             units: days since 1980-1-1 0:0:0
##             standard_name: time
##             calendar: gregorian
##             axis: T
## 
##     15 global attributes:
##         Conventions: CF-1.6
##         title: CHIRPS Version 2.0
##         history: created by Climate Hazards Group
##         version: Version 2.0
##         date_created: 2019-01-30
##         creator_name: Pete Peterson
##         creator_email: pete@geog.ucsb.edu
##         institution: Climate Hazards Group.  University of California at Santa Barbara
##         documentation: http://pubs.usgs.gov/ds/832/
##         reference: Funk, C.C., Peterson, P.J., Landsfeld, M.F., Pedreros, D.H., Verdin, J.P., Rowland, J.D., Romero, B.E., Husak, G.J., Michaelsen, J.C., and Verdin, A.P., 2014, A quasi-global precipitation time series for drought monitoring: U.S. Geological Survey Data Series 832, 4 p., http://dx.doi.org/110.3133/ds832. 
##         comments:  time variable denotes the first day of the given year.
##         acknowledgements: The Climate Hazards Group InfraRed Precipitation with Stations development process was carried out through U.S. Geological Survey (USGS) cooperative agreement #G09AC000001 "Monitoring and Forecasting Climate, Water and Land Use for Food Production in the Developing World" with funding from: U.S. Agency for International Development Office of Food for Peace, award #AID-FFP-P-10-00002 for "Famine Early Warning Systems Network Support," the National Aeronautics and Space Administration Applied Sciences Program, Decisions award #NN10AN26I for "A Land Data Assimilation System for Famine Early Warning," SERVIR award #NNH12AU22I for "A Long Time-Series Indicator of Agricultural Drought for the Greater Horn of Africa," The National Oceanic and Atmospheric Administration award NA11OAR4310151 for "A Global Standardized Precipitation Index supporting the US Drought Portal and the Famine Early Warning System Network," and the USGS Land Change Science Program.
##         ftp_url: ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-latest/
##         website: http://chg.geog.ucsb.edu/data/chirps/index.html
##         faq: http://chg-wiki.geog.ucsb.edu/wiki/CHIRPS_FAQ

The raster chirps provides annual cumulative precipitation for the globe over 38 years (38 bands), from 1981 to 2018 at a resolution of 0.05 lon/lat decimal degree.

Next, the data is cropped to the El Salvador’s boundaries and all years until 2017 are selected (data are provided up to Jan 2018 in the file):

# crop the data
chirps_ev=raster::crop(chirps,
                       SLV_adm0)
time = raster::getZ(chirps_ev) # get the time index

index_of_1981_2017 = which(dplyr::between(time,
                                   as.Date("1981-01-01"),
                                   as.Date("2017-01-01")))
index_of_2017 = which(time==as.Date("2017-01-01"))

# subset the CHIRPS on 1981-2017 (take out 2018)
chirps_ev_1981_2017=raster::subset(chirps_ev,
                                   index_of_1981_2017)

9.4.3 Soil degradation

The soil degradation data were obtained via the QGIS plugin tool “TRENDS.EARTH”9.

Three main soil-degradation indexes are provided in the TRENDS.EARTH:

  • Soil carbon stock degradation,
  • Soil Land use change degradation,
  • Soil Productivity degradation:
    • trajectory
    • performance
    • state

Data for each of the above are provided in a series of multi-layers .tif files. The .json file provides the information about each layer.

soil_deg_carb_path="spatial/raster/soil_deg_carb/soil_deg_carb.json"
soil_deg_carb_json=RJSONIO::fromJSON(paste0(dir_data,
                                            soil_deg_carb_path)) 

grep("degradation",
     lapply(lapply(soil_deg_carb_json$bands,
                   unlist),
            cbind))
## [1] 1
soil_deg_luc_json_path="spatial/raster/soil_deg_luc/soil_deg_luc.json"
soil_deg_luc_json=RJSONIO::fromJSON(paste0(dir_data,
                                           soil_deg_luc_json_path)) 
lapply(lapply(soil_deg_luc_json$bands,
              unlist),
       cbind)
## [[1]]
##                        [,1]                      
## add_to_map             "TRUE"                    
## metadata.year_baseline "1992"                    
## metadata.year_target   "2015"                    
## name                   "Land cover (degradation)"
## no_data_value          "-32768"                  
## 
## [[2]]
##               [,1]                      
## add_to_map    "FALSE"                   
## metadata.year "1992"                    
## name          "Land cover (ESA classes)"
## no_data_value "-32768"                  
## 
## [[3]]
##               [,1]                      
## add_to_map    "FALSE"                   
## metadata.year "2015"                    
## name          "Land cover (ESA classes)"
## no_data_value "-32768"                  
## 
## [[4]]
##                        [,1]                    
## add_to_map             "TRUE"                  
## metadata.year_baseline "1992"                  
## metadata.year_target   "2015"                  
## name                   "Land cover transitions"
## no_data_value          "-32768"                
## 
## [[5]]
##               [,1]                  
## add_to_map    "TRUE"                
## metadata.year "1992"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[6]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1993"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[7]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1994"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[8]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1995"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[9]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1996"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[10]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1997"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[11]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1998"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[12]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "1999"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[13]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2000"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[14]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2001"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[15]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2002"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[16]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2003"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[17]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2004"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[18]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2005"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[19]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2006"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[20]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2007"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[21]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2008"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[22]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2009"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[23]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2010"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[24]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2011"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[25]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2012"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[26]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2013"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[27]]
##               [,1]                  
## add_to_map    "FALSE"               
## metadata.year "2014"                
## name          "Land cover (7 class)"
## no_data_value "-32768"              
## 
## [[28]]
##               [,1]                  
## add_to_map    "TRUE"                
## metadata.year "2015"                
## name          "Land cover (7 class)"
## no_data_value "-32768"
grep("degradation",
     lapply(lapply(soil_deg_luc_json$bands,
                   unlist),
            cbind))
## [1] 1
soil_deg_prod_json_path="spatial/raster/soil_deg_prod/soil_deg_prod.json"
soil_deg_prod_json=RJSONIO::fromJSON(paste0(dir_data,
                                            soil_deg_prod_json_path)) 

grep("degradation",
     lapply(lapply(soil_deg_prod_json$bands,
                   unlist),
            cbind))
## [1] 4 7

For the carbon and land use change files, Band 1 provides the soil degradation index. For the productivity file, the index values on the trajectory, performance and state are stored in band 2, 4 and 7.

soil_deg_carb_path="spatial/raster/soil_deg_carb/soil_deg_carb.tif"
soil_deg_carb=raster::raster(paste0(dir_data,
                                    soil_deg_carb_path),
                             band=1) # soil degradation, change in the carbon method


soil_deg_luc_path="spatial/raster/soil_deg_luc/soil_deg_luc.tif"
soil_deg_luc=raster::raster(paste0(dir_data,
                                   soil_deg_luc_path),
                            band=1) # soil degradation, land use change


soil_deg_prod_trj_path="spatial/raster/soil_deg_prod/soil_deg_prod.tif"
soil_deg_prod_trj=raster::raster(paste0(dir_data,
                                        soil_deg_prod_trj_path),
                                 band=2) # soil degradation, productivity trajectory deg
soil_deg_prod_prf_path="spatial/raster/soil_deg_prod/soil_deg_prod.tif"
soil_deg_prod_prf=raster::raster(paste0(dir_data,
                                        soil_deg_prod_prf_path),
                                 band=4) # soil degradation, productivity performance deg

soil_deg_prod_stt_path="spatial/raster/soil_deg_prod/soil_deg_prod.tif"
soil_deg_prod_stt=raster::raster(paste0(dir_data,
                                        soil_deg_prod_stt_path),
                                 band=7) # soil degradation, productivity state deg

9.4.4 Vegetation greenness

Vegetation greenness data, as measured by normalized difference vegetation index, are provided on a dekadal basis i.e. every 10 days. Two data products are available:

  • The smoothed dekadal NDVI value for 2017
  • The median dekadal NDVI value over the period 2003-2017.

The first step is to unzip the data with the following code snippet. For sake of convenience, the data are provided as unzipped .tif in the data pack, but the code to unzip the files is provided below.

#get the list of all files in the ndvi directory with the command -dir-
list_of_files=dir(paste0(dir_data,
                         "spatial/raster/ndvi/"))

# identify the index of the 2017 monthly files and norma
index_of_2017_std_files=grep("17|stm",
                             dir(paste0(dir_data,
                                        "spatial/raster/ndvi/")))

# select in the list of all file the 2017 monthly files thanks to the index
files_of_interest=list_of_files[index_of_2017_std_files]

for(file in files_of_interest){
  untar(paste0(dir_data,
               "spatial/raster/ndvi/",
               file), # the path to the file of interest
        compressed="gzip", 
        exdir=paste0(dir_data,
                     "spatial/raster/ndvi/")) 
}
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca01stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca01stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca02stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca02stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca03stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca03stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca04stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca04stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca05stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca05stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca06stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca06stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca07stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca07stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca08stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca08stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca09stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca09stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca10stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca10stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca11stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca11stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca12stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca12stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca13stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca13stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca14stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca14stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca15stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca15stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca16stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca16stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1701.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1701.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1702.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1702.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1703.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1703.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1704.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1704.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1705.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1705.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1706.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1706.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1707.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1707.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1708.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1708.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1709.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1709.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1710.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1710.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1711.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1711.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1712.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1712.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1713.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1713.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1714.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1714.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1715.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1715.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1716.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1716.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1717.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1717.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1718.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1718.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1719.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1719.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1720.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1720.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1721.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1721.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1722.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1722.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1723.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1723.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1724.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1724.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1725.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1725.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1726.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1726.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1727.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1727.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1728.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1728.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1729.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1729.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1730.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1730.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1731.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1731.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1732.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1732.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1733.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1733.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1734.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1734.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1735.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1735.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1736.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca1736.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca17stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca17stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca18stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca18stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca19stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca19stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca20stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca20stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca21stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca21stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca22stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca22stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca23stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca23stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca24stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca24stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca25stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca25stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca26stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca26stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca27stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca27stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca28stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca28stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca29stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca29stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca30stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca30stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca31stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca31stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca32stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca32stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca33stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca33stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca34stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca34stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca35stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca35stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca36stm.tfw' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## ca36stm.tif' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1701.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1702.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1703.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1704.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1705.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1706.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1707.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1708.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1709.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1710.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1711.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## cam1712.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm01.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm02.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm03.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm04.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm05.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm06.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm07.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm08.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm09.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm10.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm11.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2
## Warning in untar(paste0(dir_data, "spatial/raster/ndvi/", file), compressed
## = "gzip", : '/bin/tar -xf '/home/rstudio/data/spatial/raster/ndvi/
## camstm12.zip' -C '~/data/spatial/raster/ndvi/'' returned error code 2

The data are then loaded. Note the use of the function grep and the regular expressions to flexibly select the appropriate set of files:

  • "ca17[[:digit:]]{2}.tif" means: “ca17”, followed by a suite of 2 digits, followed by “.tif” -> these are the 2017 files
  • "ca[[:digit:]]{2}stm.tif" means: “ca”, followed by a suite of 2 digits, followed by “stm.tif” -> these are the median files
# create a vector of character to name the raster as ndvi_1, ndvi_2, etc.
names_ndvi_17=paste0("ndvi_",1:36)
names_ndvi_norm=paste0("ndvi_norm_",1:36)

# List of the 36 dekadal file for 2017 thanks to the command -grep- and the regular expression
list_of_files=dir(paste0(dir_data,
                         "spatial/raster/ndvi/"))

files_17=list_of_files[grep("ca17[[:digit:]]{2}.tif",
                            list_of_files)]
# and the list of 36 ones for the 2003-17 median
files_norm=list_of_files[grep("ca[[:digit:]]{2}stm.tif",
                              list_of_files)]

# loop over the 36 dekadale 2017 and normal files in order to laod them and to crop them
for(i in 1:length(files_17)){
  
  # 1) the 2017 data:
  # load each i dekadale layer
  r=raster::raster(paste0(dir_data,
                          "spatial/raster/ndvi/",
                          files_17[i])) 
  # crop each i monthly layer
  r=raster::crop(r,
                 SLV_adm0)
  
  # assign each i cropped monthly layer to a name
  assign(names_ndvi_17[i], # assign it to an object name
         r)
  
  # 2) the median data:
  # load each i dekadale layer
  r=raster::raster(paste0(dir_data,
                          "spatial/raster/ndvi/",
                          files_norm[i])) 
  # crop each i monthly layer
  r=raster::crop(r,
                 SLV_adm0)
  
  # assign each i cropped monthly layer to a name
  assign(names_ndvi_norm[i], # assign it to an object name
         r)
}

# stack all the layers
ndvi_17=raster::stack(lapply(names_ndvi_17,
                             FUN=get))
ndvi_norm=raster::stack(lapply(names_ndvi_norm,
                               FUN=get)) 

Rasters are then re-sampled in order to have both sets of rasters perfectly aligned:

ndvi_17_re=raster::resample(ndvi_17,
                            ndvi_norm)

9.4.5 Temperature

The 12 monthly average temperature data are cropped and stacked.

files_temp=dir(paste0(dir_data,
                      "spatial/raster/temperature/"))[grep("wc2.0_30s_tavg_",
                                                           dir(paste0(dir_data,
                                                                      "spatial/raster/temperature/")))]
month=1
for(temp_m in files_temp){
  temp=raster::raster(paste0(dir_data,
                             "spatial/raster/temperature/",temp_m))
  temp=raster::crop(temp,
                    SLV_adm0)
  
  assign(paste0("temp_",month),
         temp)
  month=month+1
}

temp_months=raster::stack(lapply(paste0("temp_",
                                       1:12),
                             FUN=get))

9.4.6 Slope

The slope data are obtained from the WorldPop dataverse as a suite of raster files, each covering a portion of the El Salvador territory. They are combined in one single raster to ease pre-processing work.

slope_files=paste0(dir_data,
                   "spatial/raster/slope/",
                   dir(paste0(dir_data,
                              "spatial/raster/slope/")))
i=1
raster_list=list()
for(file in slope_files){
  raster_f=raster::raster(file)
  raster_list[[i]]=raster_f
  i=i+1
}
slope <- do.call(raster::merge, raster_list)

slope=raster::crop(slope,
                   SLV_adm0)
raster::plot(slope,
             main="Slope (degrees)")
sp::plot(SLV_adm0,
             add=T)

9.4.7 Other rasters

The other rasters are loaded with the function raster:

dem=raster::raster(paste0(dir_data,
                          "spatial/raster/dem/srtm_dem.tif"))

dist2road_path="spatial/raster/distance2roads/slv_osm_dst_road_100m_2016.tif"
dist2road=raster::raster(paste0(dir_data,
                          dist2road_path))

dist2roadInter_path="spatial/raster/distance2intersections/slv_osm_dst_roadintersec_100m_2016.tif"
dist2roadInter=raster::raster(paste0(dir_data,
                       dist2roadInter_path))

settlements_path="spatial/raster/settlements/slv_bsgme_v0a_100m_2017.tif"
settlements=raster::raster(paste0(dir_data,
                          settlements_path))

dist2water_path="spatial/raster/distance2waterway/slv_osm_dst_waterway_100m_2016.tif"
dist2water=raster::raster(paste0(dir_data,
                          dist2water_path))
gpw_path="spatial/raster/GPW/gpw_v4_population_count_adjusted_to_2015_unwpp_country_totals_rev11_2015_30_sec.tif"
gpw=raster::raster(paste0(dir_data,gpw_path)) # 2015 population count
gpw_es=raster::crop(gpw,
                    SLV_adm0)

lc_path="spatial/raster/CCILC/ESACCI-LC-L4-LCCS-Map-300m-P1Y-2015-v2.0.7_sv.tif"
lc=raster::raster(paste0(dir_data,lc_path)) # land class

We can plot them for quality controls. Fig 9.4 shows the the digital elevation model, i.e. the altitude across the country.

raster::plot(dem)
sp::plot(SLV_adm0,
             add=T)
Digital Elevation Model

Figure 9.4: Digital Elevation Model

Fig 9.5 shows the population density data. When examining the GPW data, one can identify some geometric forms corresponding to the enumeration areas of the census.

raster::plot(gpw_es)
sp::plot(SLV_adm0,
             add=T)
Population density data (GPW)

Figure 9.5: Population density data (GPW)

9.5 Rasters calculation

Lights at night is expected to be a good predictor of economic activity (Henderson, Storeygard, and Weil 2012). However, it is a priori not clear which summary statistics will best predict the three development indicators. For example, we cannot establish with certainty whether annual average lights at night or annual minimum average monthly lights at night is a better predictor of poverty.

This holds true for the other raster data as well.

9.5.1 Lights at night

The median and standard deviation of the lights at night data is computed for each map pixel:

lights_med=raster::calc(lights_all,
                        fun = median,
                        na.rm = T)
lights_sd=raster::calc(lights_all,
                       fun = sd,
                       na.rm = T)
lights_summaries=raster::stack(lights_med,
                               lights_sd)

names(lights_summaries)=c("lights_med",
                          "lights_sd")

Results are plotted on Fig ??.

raster::plot(lights_summaries,
             main=c("Median",
                    "Standard Deviation"))
sp::plot(SLV_adm0,
             add=T)

9.5.2 Precipitation

Again, as it is not clear a priori which summary statistics of precipitation is best at predicting the development indicators, a series of summary statistics is computed for each pixel of the map:

  • Two precipitation summaries related to short-term weather:
    • the 2017 annual cumulative precipitation
    • the 2017 annual cumulative precipitation as percentage of normal
  • Two precipitation summaries related to climate:
    • the median annual cumulative precipitation over the 38 years period
    • the standard deviation in precipitation between years
# compute summaries
chirps_ev_2017=raster::subset(chirps_ev,
                              index_of_2017) # 2017

chirps_ev_med=raster::calc(chirps_ev_1981_2017,
                           fun = median,
                           na.rm = T)  #medi


chirps_summaries=raster::stack(chirps_ev_2017,
                               chirps_ev_med)
names(chirps_summaries)=c("chirps_ev_2017",
                          "chirps_ev_med")

For quality control, let us plot the various summaries.

raster::plot(chirps_summaries,
             main=c("2017 precipitation",
                    "2017 precipitation as percentage of normal",
                    "1981-2017 median",
                    "1981-2017 standard deviation"))

9.5.3 Vegetation greenness

Various statistics are computed on the dekadal NDVI and its deviation from the 2003-17 normal (median pixel value over the period).

ndvi_17_perc=(ndvi_17_re/ndvi_norm)*100

# annual NDVI values
ndvi_17_sum=raster::calc(ndvi_17_re,
                         fun = sum,
                         na.rm = T)
ndvi_17_summaries=raster::stack(ndvi_17_sum)

names(ndvi_17_summaries)=c("ndvi_17_sum")

# NDVI as a percentage of the 2003-17 norm
ndvi_17_perc_sum=raster::calc(ndvi_17_perc,
                              fun = sum,
                              na.rm = T)

ndvi_17_perc_summaries=raster::stack(ndvi_17_perc_sum)

names(ndvi_17_perc_summaries)=c("ndvi_17_perc_sum")

9.5.4 Soil degradation

The missing data are stored in the raw rasters as -32768. They are turned into NA. Next, the soil degradation data are simplified by creating 1 binary raster for the three soil degradation proxies: the soil is degraded or not.

#  Soil carbon stock degradation
# replace the NA 
soil_deg_carb_val=raster::getValues(soil_deg_carb)
index_na=which(soil_deg_carb_val==-32768)

soil_deg_carb_val[index_na]=NA #set to NA the cell identified with original value of -32768

# and binarise for degradation
index_deg=which(soil_deg_carb_val<0)
soil_deg_carb_deg=soil_deg_carb_val
soil_deg_carb_deg[index_deg]=1
soil_deg_carb_deg[-index_deg]=0
soil_deg_carb_deg[index_na]=NA

soil_deg_carb_deg=raster::setValues(soil_deg_carb,
                                    soil_deg_carb_deg)  

As the code is quite lengthy, it is only shown for the carbon stock soil degradation.

#  Soil Land use change degradation 
soil_deg_luc_val=raster::getValues(soil_deg_luc)
index_deg=which(soil_deg_luc_val<0)

soil_deg_luc_deg=soil_deg_luc_val
soil_deg_luc_deg[index_deg]=1
soil_deg_luc_deg[-index_deg]=0

soil_deg_luc_deg=raster::setValues(soil_deg_luc,
                                   soil_deg_luc_deg)  

#  Soil Productivity degradation 

# trajectory 
soil_deg_prod_trj_val=raster::getValues(soil_deg_prod_trj)
index_na=which(soil_deg_prod_trj_val==-32768)
index_deg=which(soil_deg_prod_trj_val<0)

soil_deg_prod_trj_val[index_na]=NA

soil_deg_prod_trj_deg=soil_deg_prod_trj_val
soil_deg_prod_trj_deg[index_deg]=1
soil_deg_prod_trj_deg[-index_deg]=0
soil_deg_prod_trj_deg[index_na]=NA

soil_deg_prod_trj_deg=raster::setValues(soil_deg_prod_trj,
                                        soil_deg_prod_trj_deg)  

# performance
soil_deg_prod_prf_val=raster::getValues(soil_deg_prod_prf)
index_na=which(soil_deg_prod_prf_val==-32768)
index_deg=which(soil_deg_prod_prf_val<0)

soil_deg_prod_prf_val[index_na]=NA

soil_deg_prod_prf_deg=soil_deg_prod_prf_val
soil_deg_prod_prf_deg[index_deg]=1
soil_deg_prod_prf_deg[-index_deg]=0
soil_deg_prod_prf_deg[index_na]=NA

soil_deg_prod_prf_deg=raster::setValues(soil_deg_prod_prf,
                                        soil_deg_prod_prf_deg)  

# state 
soil_deg_prod_stt_val=raster::getValues(soil_deg_prod_stt)
index_na=which(soil_deg_prod_stt_val==-32768)
index_deg=which(soil_deg_prod_stt_val<0)

soil_deg_prod_stt_val[index_na]=NA

soil_deg_prod_stt_deg=soil_deg_prod_stt_val
soil_deg_prod_stt_deg[index_deg]=1
soil_deg_prod_stt_deg[-index_deg]=0
soil_deg_prod_stt_deg[index_na]=NA

soil_deg_prod_stt_deg=raster::setValues(soil_deg_prod_stt,
                                        soil_deg_prod_stt_deg)  

Finally, the SDG indicator (15.3) for soil degradation indicator is computed. It is based on the one-out-all-out principle i.e. if a pixel is considered as soil degraded in any of the three soil degradation layers, then it is considered as degraded in the SDG indicator.

The first step consists of aligning the raster, i.e. making sure that the pixels of each layer are perfectly overlaid.

# check the rasters' extents are the same
all(raster::extent(soil_deg_luc_deg)==raster::extent(soil_deg_prod_stt_deg),
    raster::extent(soil_deg_carb_deg)==raster::extent(soil_deg_prod_stt_deg))
## [1] FALSE

As some of the layers have a slightly different extent, the values of each layer is re-sampled in the same raster grid.

# resample the raster "soil_deg_luc_deg" to the same resolution and extent than the raster "soil_deg_prod_stt_deg"   
soil_deg_luc_deg_re=raster::resample(soil_deg_luc_deg,
                                     soil_deg_prod_stt_deg) 

# we need to round the value as the resampling takes averages whil we want 0/1
soil_deg_luc_deg_re_val=raster::getValues(soil_deg_luc_deg_re)
soil_deg_luc_deg_re_val_r=round(soil_deg_luc_deg_re_val) 

soil_deg_luc_deg_re_r=raster::setValues(soil_deg_luc_deg_re,
                                        soil_deg_luc_deg_re_val_r)

raster::plot(soil_deg_luc_deg_re_r)
sp::plot(SLV_adm0,
             add=T)

The same is done with the raster soil_deg_carb_deg (not shown).

The resulting rasters are stacked and the SDG index computed.

# stack the raster ####
soil_deg=raster::stack(soil_deg_carb_deg_re_r,
                       soil_deg_luc_deg_re_r,
                       soil_deg_prod_stt_deg,
                       soil_deg_prod_prf_deg,
                       soil_deg_prod_trj_deg)
names(soil_deg)=c("soil_carb",
                  "soil_luc",
                  "soil_prod_state",
                  "soil_prod_prf",
                  "soil_prod_trj")

#  SDG 15.3.1 soil degradation index (one out all out principle) ####
soil_deg_SDG_index=raster::calc(soil_deg,
                                sum, 
                                na.rm=T)
soil_deg_SDG_index=raster::calc(soil_deg_SDG_index,
                                fun=function(x) {x[x>0] = 1; return(x)}
)

And here is the map of soil degradation according to the SDG indicator 15.3:

raster::plot(soil_deg_SDG_index)
sp::plot(SLV_adm0,
             add=T)
Soil Degradation, SDG indicator 15.3

Figure 9.6: Soil Degradation, SDG indicator 15.3

9.5.5 Temperature

Let us calculate the median and standard deviation of temperature variations for the year 2017.

temp_median=raster::calc(temp_months,
                         fun = median,
                         na.rm = T)

temp_summaries=raster::stack(temp_median)
names(temp_summaries)=c("temp_median")
raster::plot(temp_median)
sp::plot(SLV_adm0,
             add=T)

9.6 Distance calculation

Distance to public services and businesses may be a good indicator of poverty and other development indicators: remote areas tend to be poorer and face lower level of literacy. Furthermore, the distance to some natural features, such as distance to the coast or distance to forested areas may be correlated with local economic development.

We compute below the distance to:

  • Public services:
    • hospitals (source: IDB)
    • health centres (source: IDB)
    • private, public and schools in general (source: IDB),
    • government offices (education and health excluded) (source: OSM)
    • any of the above
  • Businessess:
    • Banks or ATMs (source: OSM)
    • any businesses, Bank or ATM included (source: OSM)
  • Natural features:
    • water bodies (land class from ESA)
    • cropland (land class from ESA)
    • forested areas (land class from ESA)
    • urban areas (land class from ESA)

9.6.1 Distance to public services

The first step is to create a raster which is used as a canvas to compute distances. Distances will be computed from each raster cell centre to the closest point of interest (e.g. the closest school). In order to have a relatively good precision of the distance estimates, we will create a raster with a resolution of 100 metres. Note the use of the projected coordinates system from the segmento_sh shapefile.

# create a raster for the computation of distances
raster_dist=raster::raster(raster::extent(segmento_sh),
                   crs=segmento_sh@proj4string,
                   res=c(100,100))
print(segmento_sh@proj4string)
## CRS arguments:
##  +proj=lcc +lat_1=13.316666 +lat_2=14.25 +lat_0=13.783333
## +lon_0=-89 +x_0=500000 +y_0=295809.184 +datum=NAD27 +units=m
## +no_defs +ellps=clrk66
## +nadgrids=@conus,@alaska,@ntv2_0.gsb,@ntv1_can.dat

We transform the CRS of the hospitals_sp shapefile to the corresponding projected CRS:

# Project the hospital shapefile ####
hospitals_proj=sp::spTransform(hospitals_sp,
                               segmento_sh@proj4string)

We can now compute the distance from each cell centre of the raster, raster_dist, to the projected shapefile of hospitals hospitals_proj with the function raster::distanceFromPoints10.

dist2hosp=raster::distanceFromPoints(raster_dist,
                                     hospitals_proj)

The distance map is plotted for inspection:

raster::plot(dist2hosp)
sp::plot(SLV_adm0_proj,
             add=T)

The colorscale indicates the distance to hospitals.

The same can be repeated for the other public services (not shown).

For the OSM data, 2 additional steps are required:

  • The data is filtered to select only the public amenities defined as:
    • Town hall
    • Post office
    • Court house
    • Police station
    • Prison
    • Fire station
    • Community centre
    • Public building (a loose OSM definition)
  • The polygon data is rasterized to allow for distance calculation with the function raster::distance
    • a buffer of 25 metres needs to be included as the function raster::rasterize considers a polygon to cross a cell only if it covers the cell centre
    • the function raster::distance computes the distance from all pixels with a NA value to the nearest pixel which is not NA.
# distance to public amenities (health and educ excluded) ####
amenities=c("townhall","post_office","courthouse","police",
            "prison","public_building","fire_station","community_centre")
# poly
poi_poly_pubserv=as(subset(poi_poly_sh,
                       amenity%in%amenities),
                "SpatialPolygons")
poi_poly_pubserv_proj=sp::spTransform(poi_poly_pubserv,
                                  segmento_sh@proj4string)

poi_poly_pubserv_proj_buffer=rgeos::gBuffer(poi_poly_pubserv_proj,width=50) 

poly_pubserv_proj_r=raster::rasterize(poi_poly_pubserv_proj_buffer,
                                  y=raster_dist)
dist2poly_pubserv=raster::distance(poly_pubserv_proj_r)

# points
poi_points_pubserv=as(subset(poi_points_sh,
                         amenity%in%amenities),
                  "SpatialPoints")
poi_points_pubserv_proj=sp::spTransform(poi_points_pubserv,
                                    segmento_sh@proj4string)
dist2points_pubserv=raster::distanceFromPoints(raster_dist,
                                           poi_points_pubserv_proj)

dist2pubamen=min(dist2points_pubserv,
             dist2poly_pubserv)

Lastly, we compute the distance to any public services for each pixel taking the minimum distance of the 4 distance maps:

# distance to all public services #####
dist2allpubserv=min(dist2pubamen,
                    dist2schools,
                    dist2health,
                    dist2hosp)

9.6.2 Distance to businesses

The same approach is adopted for the distance to businesses. We start with the distance to financial services access points defined as banks and ATM.

# distance to FAPS ####
amenities=c("bank","atm")
poi_poly_fin=as(subset(poi_poly_sh,
                       amenity%in%amenities),
                "SpatialPolygons")
poi_poly_fin_proj=sp::spTransform(poi_poly_fin,
                                 segmento_sh@proj4string)
poi_poly_fin_proj_buffer=rgeos::gBuffer(poi_poly_fin_proj,width=50) 

poly_fin_proj_r=raster::rasterize(poi_poly_fin_proj_buffer,
                                  y=raster_dist)
dist2poly_fin=raster::distance(poly_fin_proj_r)


poi_points_fin=as(subset(poi_points_sh,
                         amenity%in%amenities),
                  "SpatialPoints")
poi_points_fin_proj=sp::spTransform(poi_points_fin,
                                 segmento_sh@proj4string)
dist2points_fin=raster::distanceFromPoints(raster_dist,
                                             poi_points_fin_proj)
dist2fin=min(dist2points_fin,
             dist2poly_fin)

We proceed with the other businesses. These are the OpenStreetMap businesses categories considered on top of banks and ATM: * Market place * Food court * Restaurant * Fast food * Cafe * Bar * Ice cream shop * Pharmacy * Internet cafe * Cinema * Fuel

# businesses in amenities
amenities=c("marketplace","pharmacy","restaurant","fast_food","cafe","bar","ice_cream","food_court",
            "internet_cafe", "cinema","fuel","atm","bank")
# poly
poi_poly_eco=as(subset(poi_poly_sh,
                       amenity%in%amenities),
                "SpatialPolygons")
poi_poly_eco_proj=sp::spTransform(poi_poly_eco,
                                  segmento_sh@proj4string)
poi_poly_eco_proj_buffer=rgeos::gBuffer(poi_poly_eco_proj,width=50) 

poly_eco_proj_r=raster::rasterize(poi_poly_eco_proj_buffer,
                                  y=raster_dist)
dist2poly_eco=raster::distance(poly_eco_proj_r)

# points
poi_points_eco=as(subset(poi_points_sh,
                         amenity%in%amenities),
                  "SpatialPoints")
poi_points_eco_proj=sp::spTransform(poi_points_eco,
                                    segmento_sh@proj4string)
dist2points_eco=raster::distanceFromPoints(raster_dist,
                                           poi_points_eco_proj)

In addition to being listed in the amenity field of the OSM data, shop locations are also tagged separately. Given that shops are also businesses, they are taken into account in the distance calculations.

# businesses as shop
# poly
poi_poly_shop=as(subset(poi_poly_sh,
                        is.na(shop)==F),
                 "SpatialPolygons")
poi_poly_shop_proj=sp::spTransform(poi_poly_shop,
                                  segmento_sh@proj4string)
poi_poly_shop_proj_buffer=rgeos::gBuffer(poi_poly_shop_proj,width=50)

poly_shop_proj_r=raster::rasterize(poi_poly_shop_proj_buffer,
                                  y=raster_dist)
dist2poly_shop=raster::distance(poly_shop_proj_r)

# points
poi_points_shop=as(subset(poi_points_sh,
                                   is.na(shop)==F),
                            "SpatialPoints")
poi_points_shop_proj=sp::spTransform(poi_points_shop,
                                    segmento_sh@proj4string)
dist2points_shop=raster::distanceFromPoints(raster_dist,
                                           poi_points_shop_proj)

Finally, we gather all business-specific distance maps into one general Distance to a businesses map.

# collect all businesses
dist2biz=min(dist2points_shop,
              dist2poly_shop,
              dist2points_eco,
              dist2poly_eco)

9.6.3 Distance to natural features

Next we compute the distance to natural features such as the coastline, urban areas, forested areas and croplands.

We start by cropping the world coastline to El-Salvadorian boundaries. Once this operation is completed, we project the map and create the necessary buffer.

# distance to coastline
coastline_c=raster::crop(coastline,
                         sp::bbox(SLV_adm0))
coastline_proj=sp::spTransform(coastline_c,
                               segmento_sh@proj4string)

coastline_proj_buffer=rgeos::gBuffer(coastline_proj,width=50) 

coastline_proj_r=raster::rasterize(coastline_proj_buffer,
                                  y=raster_dist)
dist2coast=raster::distance(coastline_proj_r)

For the distance to the urban areas, forested areas, the croplands, and water bodies we use to NA pixels which are not urban, tree cover, cropland or water bodies, respectively. We then we use the function raster::distance which computes the distance from each NA pixel to the closest non NA pixel.

# distance to urban area ####
lc_urban=lc
lc_urban=raster::calc(lc_urban,
                        fun=function(x) {ifelse(x==190,1,NA)})
lc_urban_proj=raster::projectRaster(lc_urban,
                                    crs=segmento_sh@proj4string)
dist2urban=raster::distance(lc_urban_proj)

# distance to forest ####
lc_tree=lc
lc_tree=raster::calc(lc_tree,
                      fun=function(x) {ifelse(x%in%c(50,60),1,NA)})
lc_tree_proj=raster::projectRaster(lc_tree,
                                    crs=segmento_sh@proj4string)
dist2tree=raster::distance(lc_tree_proj)

# distance to crop ####
lc_crop=lc
lc_crop=raster::calc(lc_crop,
                     fun=function(x) {ifelse(x%in%c(10,20,30,40),1,NA)})
lc_crop_proj=raster::projectRaster(lc_crop,
                                   crs=segmento_sh@proj4string)
dist2crop=raster::distance(lc_crop_proj)


# distance to water ####
lc_water=lc
lc_water=raster::calc(lc_water,
                     fun=function(x) {ifelse(x%in%c(210),1,NA)})
lc_water_proj=raster::projectRaster(lc_water,
                                   crs=segmento_sh@proj4string)
dist2water=raster::distance(lc_water_proj)

9.6.4 Stack distance maps and inspect results

Once the extent and resolution have been aligned with the function raster::resample, the distance maps are stacked with the function raster::stack.

dist2coast_r=raster::resample(dist2coast,
                              dist2fin)
dist2water_r=raster::resample(dist2water,
                              dist2fin)
dist2crop_r=raster::resample(dist2crop,
                              dist2fin)
dist2tree_r=raster::resample(dist2tree,
                              dist2fin)
dist2urban_r=raster::resample(dist2urban,
                              dist2fin)
dist_stack=raster::stack(dist2schools_priv,
                 # dist2schools_pub,
                 dist2schools,
                 dist2health,
                 dist2hosp,
                 dist2pubamen,
                 dist2allpubserv,
                 dist2biz,
                 dist2fin,
                 dist2coast_r,
                 dist2water_r,
                 dist2crop_r,
                 dist2tree_r,
                 dist2urban_r)
names(dist_stack)=c("dist2schools_priv",
                    # "dist2schools_pub",
                    "dist2schools",
                    "dist2health",
                    "dist2hosp",
                    "dist2pubamen",
                    "dist2allpubserv",
                    "dist2biz",
                    "dist2fin",
                    "dist2coast_r",
                    "dist2water_r",
                    "dist2crop_r",
                    "dist2tree_r",
                    "dist2urban_r")

We can now examine the results. The next maps show the distance financial access points, businesses, private schools, public schools, (only the code for distance to financial services access points is shown).

SLV_adm0_proj=sp::spTransform(SLV_adm0,
                              segmento_sh@proj4string)
raster::plot(dist_stack$dist2fin, main="Distance to financial access points, source: OSM")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
Distance to financial access points

Figure 9.7: Distance to financial access points

raster::plot(dist_stack$dist2biz, main="Distance to businesses, source: OSM")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
Distance to financial access points

Figure 9.7: Distance to financial access points

raster::plot(dist_stack$dist2schools_priv, main="Distance to private school, source: IDB")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
sp::plot(waterbodies_proj,col="blue",add=T)

# raster::plot(dist_stack$dist2schools_pub, main="Distance to public schools, source: IDB")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
Distance to financial access points

Figure 9.7: Distance to financial access points

raster::plot(dist_stack$dist2urban_r, main="Distance to urban areas, source: IDB")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
Distance to financial access points

Figure 9.7: Distance to financial access points

raster::plot(dist_stack$dist2hosp, main="Distance to hospital, source: IDB")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
Distance to financial access points

Figure 9.7: Distance to financial access points

raster::plot(dist_stack$dist2pubamen, main="Distance to public amenities, source: OSM")
sp::plot(SLV_adm0_proj,add=T)
sp::plot(waterbodies_proj,col="blue",add=T)
Distance to financial access points

Figure 9.7: Distance to financial access points

9.7 Spatial statistics at segmento level

The raster and vector layers built above need to be matched with the EHPM survey data for them to be used in the analysis. This can be done either by:

  • identifying the centroid (i.e. the middle point) of each segmento and extracting the value of each covariate layer at the centroid of each segmento;
  • computing the spatial average, minimum or maximum (or other) of each covariate for each segmento.

Although the first approach is simpler to implement, the second one is preferred: data on poverty, income and literacy are available at the segmento level, i.e. these are area data rather than point-referenced data. Therefore, it is best to provide spatial summary statistics at the same level of spatial aggregation level rather than extracting data at the centroid of the segmento, an arbitrary location which might not be representative of the condition prevailing in the segmento.

A prerequisite for merging to the data to is to have all data defined on the same coordinates reference system.

9.7.1 Adjusting the Coordinates Reference System for raster extaction to segmento

In order to make sure all the pre-processed RS layers are set to this same CRS, the following process is adopted:

  • a list of layer is created
  • layer with a different CRS are identified
  • layer with a different CRS are re-projected
# create the list of rasters
rasters_list=list(chirps_summaries,
                  soil_deg,soil_deg_SDG_index,
                  ndvi_17_summaries,ndvi_17_perc_summaries,
                  dem,
                  gpw_es,
                  temp_summaries,
                  dist2road,
                  dist2roadInter,
                  slope,
                  settlements,
                  dist2water,
                  lights_summaries,
                  lc,
                  dist_stack)

sum(unlist(lapply(rasters_list, raster::nlayers))) # number of layers
## [1] 34
names(rasters_list)=c("chirps_summaries",
                      "soil_deg","soil_deg_SDG_index",
                      "ndvi_17_summaries","ndvi_17_perc_summaries",
                      "dem",
                      "gpw_es",
                      "temp_summaries",
                      "dist2road",
                      "dist2roadInter",
                      "slope",
                      "settlements",
                      "dist2water",
                      "lights_summaries",
                      "lc",
                      "dist_stack")


# identify the rasters with a different projection
different_proj=which(lapply(rasters_list,
                            FUN=function(x) sp::proj4string(x)==sp::proj4string(segmento_sh_wgs))==F)

# reproject these raster to the same geographic coordinates system than Segmento
rasters_list[different_proj]=lapply(rasters_list[different_proj],
                                    FUN=function(x) raster::projectRaster(x,                                                                          crs=sp::proj4string(segmento_sh_wgs)))

# turn the list items into object, i.e. replace the original raster objects with 
# the new raster objects having the correct coordinate system
for(i in 1:length(rasters_list)) {
  assign(names(rasters_list)[i],
         rasters_list[[i]])
}

The same is done for the list of vectors files (not shown), the only difference being that the function raster::projectRaster is replaced by the function sp::spTransform as a vector is used.

9.7.2 Rasters

For each raster layer, we now need to get a summary statistics for each segmento area. We will start by getting segmento area average value for each raster layer11, except for the population where we will also get the \(sum\), i.e. total population count per segmento, and land class where we will get the percentage of segmento area for a subset of classes.

The package velox rather than standard extract function from the raster package is used as it is much faster12.

rasters_extraction_list=c("chirps_summaries","soil_deg","soil_deg_SDG_index","ndvi_17_summaries","ndvi_17_perc_summaries",
                          "dem","temp_summaries","dist2road","dist2roadInter","slope","settlements","dist2water","lights_summaries","dist_stack")
seg_r_s=c() # create a empty vector to store the extracted data
START_e=Sys.time() # record the start time of the loop
for(r in rasters_extraction_list){ # loop over the list of rasters

  # create a velox object from the rastr
  r_velox=velox::velox(get(r))  # The function `get(x)` interpret the character *x* as an
  # object. In the code snippet below, it allows us to loop through the list of raster names.

  ex.mat <- r_velox$extract(segmento_sh_wgs,
                            fun=function(x) mean(x,na.rm=T),
                            small=T) # get the mean

  seg_r_s=cbind(seg_r_s,ex.mat) # add the extracted data as a new column
}
end_e=Sys.time()
print(end_e-START_e)
## Time difference of 3.201328 mins
colnames(seg_r_s)=c(names(chirps_summaries),
                    names(soil_deg),
                    "soil_deg_SDG_index",
                    names(ndvi_17_summaries),
                    names(ndvi_17_perc_summaries),
                    "dem",names(temp_summaries),
                    "dist2road","dist2roadInter","slope","settlements","dist2water",
                    names(lights_summaries),
                    names(dist_stack))

For the population raster, we get the \(sum\) per segmento:

# pop
r_velox=velox::velox(gpw_es)
ex.mat_pop <- r_velox$extract(segmento_sh_wgs,
                            fun=function(x) sum(x,na.rm=T),
                            small=T) # get the sum
colnames(ex.mat_pop)="gpw_es_TOT"

For the land cover, we extract the share of each segmento area classified as:

  • Urban areas:
    • land class: 190
  • Tree covers area13:
    • land class: 50, Tree cover broadleaved evergreen closed to open (>15% of pixel area)
    • land class: 60, Tree cover broadleaved deciduous closed to open (>15% of pixel area)
  • Cropland:
    • land class: 10, Cropland rainfed
    • land class: 20, Cropland irrigated or post-flooding
    • land class: 30, Mosaic cropland (>50% of pixel area) / natural vegetation (tree shrub herbaceous cover) (<50% of pixel area)
    • land class: 40, Mosaic natural vegetation (tree shrub herbaceous cover) (>50% of pixel area) / cropland (<50% of pixel area)
# LC ####
r_velox=velox::velox(lc)
# urban LC
ex.mat_urb <- r_velox$extract(segmento_sh_wgs,
                            fun=function(x) sum(x==190,na.rm=T)/(sum(x==190,na.rm=T)+sum(x!=190,na.rm=T)),
                            small=T) # get the sum

# Tree cover 
# 50;Tree cover  broadleaved     evergreen   closed to open (>15%);0;100;0
# 60;Tree cover  broadleaved     deciduous   closed to open (>15%);0;160;0
ex.mat_tree <- r_velox$extract(segmento_sh_wgs,
                            fun=function(x) sum(x%in%c(50,60, 100,110),na.rm=T)/(sum(x%in%c(50,60,100,110),na.rm=T)+sum(!(x%in%c(50,60,100,110)),na.rm=T)),
                            small=T) # get the sum

# crop land 
# 10;Cropland    rainfed;255;255;100
# 20;Cropland    irrigated or post-flooding;170;240;240     
# 30;Mosaic cropland (>50%) / natural vegetation (tree   shrub   herbaceous cover) (<50%);220;240;100
# 40;Mosaic natural vegetation (tree     shrub   herbaceous cover) (>50%) / cropland (<50%) ;200;200;100
ex.mat_crop <- r_velox$extract(segmento_sh_wgs,
                            fun=function(x) sum(x%in%c(10,20,30,40),na.rm=T)/(sum(x%in%c(10,20,30,40),na.rm=T)+sum(!(x%in%c(10,20,30,40)),na.rm=T)),
                            small=T) # get the sum


colnames(ex.mat_urb)="lc_urban"
colnames(ex.mat_tree)="lc_tree"
colnames(ex.mat_crop)="lc_crop"

Lastly, the extracted data are added to a data frame before merging it with the segmento SpatialPolygonDataframe by binding its columns with the function cbind.

seg_r=cbind(seg_r_s,
            ex.mat_pop,
            ex.mat_urb,
            ex.mat_tree,
            ex.mat_crop) # compile the extracted data
seg_r_df=as.data.frame(seg_r)

segmento_sh_wgs@data=cbind(segmento_sh_wgs@data,
                           seg_r_df)
# ogrDrivers()
# writeOGR(segmento_sh_wgs,
#                 paste0(dir_data,
#                        "spatial/shape/out/STPLAN_Segmentos_WGS84_wdata2.shp"),
#                 driver = "ESRI Shapefile",
#                 layer=1)
# 

9.7.2.1 Quality check

We now check that the extraction worked well and did not produced any missing values. This could be caused notably by the raster layers not providing any value for small segmentos on islands.

missing_r=apply(segmento_sh_wgs@data,2,FUN=function(x) which(is.na(x)))
missing_r=unique(c(unlist(missing_r)))

missing_SEG=segmento_sh_wgs%>%
  subset(SEG_ID%in%segmento_sh_wgs$SEG_ID[missing_r])
missing_SEG_coords=sp::coordinates(missing_SEG)
leaflet(SLV_adm0)%>%
  addPolygons()%>%
  addMarkers(missing_SEG_coords[,1],missing_SEG_coords[,2])
Observations with no data

Figure 9.8: Observations with no data

We observe on Fig. 9.8 that most missing values are in border areas. Two main reasons explain these NA:

  • A lot of raster maps produced by WorldPop and HDX used boudaries produced by the open source Global Administrative Boundaries project. Unfortunately, the El Salvador boundaries in the the GADM are incorrect.
  • Most pre-processed rasters are clipped in order to not overpass the boudaries, coastline and waterbodies. Some small segmentos at the boarder, on the coast or on the shore of Lago Ilopango are hence not covered.

The raster with missing values are listed below.

missing_r=apply(segmento_sh_wgs@data,2,FUN=function(x) which(is.na(x)))
data.frame(n_miss=unlist(lapply(missing_r,length)),
           var_miss=names(missing_r))%>%
  arrange(desc(n_miss))%>%
  filter(n_miss>0)
##    n_miss           var_miss
## 1      22          dist2road
## 2      22     dist2roadInter
## 3      22        settlements
## 4      17        temp_median
## 5      15      soil_prod_prf
## 6       6          soil_carb
## 7       1     chirps_ev_2017
## 8       1      chirps_ev_med
## 9       1           soil_luc
## 10      1    soil_prod_state
## 11      1      soil_prod_trj
## 12      1 soil_deg_SDG_index
## 13      1              slope

As their number is relatively small (a max of 22 missing values for 13 covariates), we decided to simply replace the missing value by the Canton average. For the one segmento where this did not work, we used the municipio average (code not shown).

9.7.3 Vectors

For the vectors, a choice has to be made on the type of statistics which are desirable from each vector layer. Here are the spatial summaries statistics we will compute:

  • Livelihood Zone Map: the Livelihood type of the segmento
  • Schools: private and public schools, the sum of each category and their total is computed
  • Health facilities: hospitals and three categories of health centers, the sum of each category and the sum of categories is computed
  • OpenStreetMap point of interests, number of:
    • buildings
    • points of interest
    • shops
    • amenities classified in 12 categories and their sum
    • OpenStreetMap roads, streets or path:s
    • length of roads, streets or paths classified in 9 categories and their sum
    • density of roads, streets or paths classified in 9 categories (length of roads/segmento area)
    • share of rural and urban roads in total road, street or path of the segmento.

9.7.3.1 Livelihood zones

As there is only on categorical value per area, the process is simple: the function over from the package sp is used to extract for each segmento the corresponding value of the segmento_sh_wgsSpatialPolygonDataFrame. The result of the function over is a data frame with one row per segmento of the segmento_sh_wgs. The Livelihood zone code (“LZCODE”) and Livelihood zone name (“LZNAMEEN”) is then added into the original data frame of segmento_sh_wgs.

# extract for each  segmento  the corresponding value of the segmento_sh_wgs's SpatialPolygonDataFrame
LZ_seg=sp::over(segmento_sh_wgs,
                LZ_sh)
LZ_seg=LZ_seg[,c("LZCODE","LZNAMEEN")]

# add the LZ data to segmento shape
segmento_sh_wgs@data$LZCODE=LZ_seg$LZCODE
segmento_sh_wgs@data$LZNAMEEN=LZ_seg$LZNAMEEN

Checking whether there are any missing values,

# check there are no NA
summary(is.na(segmento_sh_wgs@data$LZNAMEEN))
##    Mode   FALSE    TRUE 
## logical   12425      10

We see that there 10 segmentos with no assigned livelihood zones. They are mapped on Fig 9.9.

missing_SEG=segmento_sh_wgs%>%
  subset(SEG_ID%in%segmento_sh_wgs$SEG_ID[which(is.na(segmento_sh_wgs@data$LZNAMEEN))])

missing_SEG_coords=sp::coordinates(missing_SEG)
leaflet(SLV_adm0)%>%
  addPolygons()%>%
  addMarkers(missing_SEG_coords[,1],missing_SEG_coords[,2])
Observations with no data

Figure 9.9: Observations with no data

All the missing values are in border areas: as the livelihood zone shapefile is not perfectly aligned to the segmento one, some segmentos fall outside the livelihood zone shapefile. A simple expedient is to replace it manually after a map inspection.

missing_canton=which(is.na(segmento_sh_wgs@data$LZNAMEEN)&
                       segmento_sh_wgs@data$CANTON%in%c("AREA URBANA","GUISQUIL"))
segmento_sh_wgs@data$LZNAMEEN[missing_canton]="Fishing, Aquaculture and Tourism Zone"
segmento_sh_wgs@data$LZCODE[missing_canton]="SV06"


missing_canton=which(is.na(segmento_sh_wgs@data$LZNAMEEN)&
                       segmento_sh_wgs@data$CANTON%in%c("ISLA ZACATILLO","ISLA MEANGUERITA"))
segmento_sh_wgs@data$LZNAMEEN[missing_canton]="Basic Grain and Labor Zone"
segmento_sh_wgs@data$LZCODE[missing_canton]="SV01"

missing_canton=which(is.na(segmento_sh_wgs@data$LZNAMEEN)&
                       segmento_sh_wgs@data$CANTON=="EL JOCOTILLO")
segmento_sh_wgs@data$LZNAMEEN[missing_canton]="Sugarcane, Agro-Industry and Labor Zone"
segmento_sh_wgs@data$LZCODE[missing_canton]="SV03"

summary(is.na(segmento_sh_wgs@data$LZNAMEEN))
##    Mode   FALSE 
## logical   12435
summary(is.na(segmento_sh_wgs@data$LZCODE))
##    Mode   FALSE 
## logical   12435

9.7.3.2 Schools

Extracting the schools’ and health facilities data follows the same principle. The difference is that the data are point data instead of polygon data. The schools and health facilities datasets are hence SpatialPointsDataFrame. The SpatialPointsDataFrame are first transformed into a SpatialPoints object thanks to the function as. The function over is then run, with an additional parameter specified: fn=sum. This tells the function over to compute the sum of SpatialPoints per segmento, i.e. the number of schools per segmento. For segmentos where there are no schools, the function over yields a NA value. NA values a replaced by zeros.

# Schools 
# total sum of schools
sch_seg=sp::over(segmento_sh_wgs,
                 as(school_sh, "SpatialPoints"),
                 fn=sum,na.rm=T)
# replace the NA value
sch_seg[which(is.na(sch_seg))]=0

The same is done for public and private schools separately. The function subset is used to subset only one category.

# N public schools
sch_pub_seg=sp::over(segmento_sh_wgs,
                     as(subset(school_sh,
                               Sector!="PRIVADO"),
                        "SpatialPoints"),
                     fn=sum,na.rm=T)
sch_pub_seg[which(is.na(sch_pub_seg))]=0

# N private schools
sch_pri_seg=sp::over(segmento_sh_wgs,
                     as(subset(school_sh,
                               Sector=="PRIVADO"),
                        "SpatialPoints"),
                     fn=sum,na.rm=T)
sch_pri_seg[which(is.na(sch_pri_seg))]=0

Lastly, the newly created covariates are added the to SpatialPolyongsDataFrame segmento_sh_wgs.

# add the school data to segmento shape
segmento_sh_wgs@data$sch=sch_seg
segmento_sh_wgs@data$sch_pub=sch_pub_seg
segmento_sh_wgs@data$sch_pri=sch_pri_seg

The same process is adopted for the health facilities before adding the covariate to segmento_sh_wgs (not shown).

9.7.3.3 OpenStreetMap data

The OpenStreetMap data are provided in 3 sets of files:

  • buildings polygons (hotosm_slv_buildings_polygons.shp)
  • points of interest points (hotosm_slv_points_of_interest_points.shp)
  • points of interest polygons (hotosm_slv_points_of_interest_polygons.shp)

For the buildings data, the sum of building per segmento is computed. This provides an indication of how urban a segmento is.

build_seg=sp::over(segmento_sh_wgs,
                   as(buildings_poly_sh,
                      "SpatialPolygons"),
                   fn=sum,na.rm=T)
build_seg[which(is.na(build_seg))]=0


# add the building data to segmento shape
segmento_sh_wgs@data$building=build_seg

For the points of interests vectors (points and poly), the number of locations per segmento identified as a Shop in OSM, i.e. a place selling retail products or services, is calculated. This provides an indication of how urban a segmento is and it can also shed some light on the local economic dynamism.

# shop
shop_poly_seg=sp::over(segmento_sh_wgs,
                       as(subset(poi_poly_sh,
                                 is.na(shop)==F),
                          "SpatialPolygons"),
                       fn=sum,na.rm=T)
shop_poly_seg[which(is.na(shop_poly_seg))]=0

shop_points_seg=sp::over(segmento_sh_wgs,
                         as(subset(poi_points_sh,
                                   is.na(shop)==F),
                            "SpatialPoints"),
                         fn=sum,na.rm=T)
shop_points_seg[which(is.na(shop_points_seg))]=0

shop_n=shop_points_seg+shop_poly_seg

# add the shop data to segmento shape
segmento_sh_wgs@data$shop=shop_n

Second, the number of amenities according to 11 categories is computed.

## Warning in kableExtra::column_spec(., 3, width = "7cm"): Please specify
## format in kable. kableExtra can customize either HTML or LaTeX outputs. See
## https://haozhu233.github.io/kableExtra/ for details.
Category Feature’s name in the data OSM amenity categories
Financial access points fin bank, atm
Marketplace mkt marketplace
Bus Station bus bus_station
Parking parking parking
Place of Worship worship place_of_worship
Health service health_srv pharmacy, clinic, dentist, doctor, hospital
Food and drink food restaurant, fast_food, cafe, bar, ice_cream, food_court
Culture and Education cult_educ school, college, university, community_centre, theatre, arts_centre, internet_cafe, cinema
Fuel: petrol station fuel fuel
Official buildings official townhall, post_office, courthouse, police, prison, public_building, fire_station

The for loop below execute the extraction.

for(i in 1: length(list_amenities)){
  extracted_poly_seg=sp::over(segmento_sh_wgs,
                              as(subset(poi_poly_sh,
                                        amenity%in%list_amenities[[i]]),
                                 "SpatialPolygons"),
                              fn=sum,na.rm=T)
  extracted_poly_seg[which(is.na(extracted_poly_seg))]=0
  
  
  extracted_point_seg=sp::over(segmento_sh_wgs,
                               as(subset(poi_points_sh,
                                         amenity%in%list_amenities[[i]]),
                                  "SpatialPoints"),
                               fn=sum,na.rm=T)
  extracted_point_seg[which(is.na(extracted_point_seg))]=0
  
  extracted_seg=extracted_point_seg+extracted_poly_seg
  
  assign(list_amenities_name[[i]],
         extracted_seg)
}


# add the shop data to segmento shape
unlist(list_amenities_name)
##  [1] "fin"        "mkt"        "bus"        "parking"    "worship"   
##  [6] "health_srv" "food"       "cult_educ"  "fuel"       "official"
segmento_sh_wgs@data$fin=fin
segmento_sh_wgs@data$mkt=mkt
segmento_sh_wgs@data$bus=bus
segmento_sh_wgs@data$parking=parking
segmento_sh_wgs@data$worship=worship
segmento_sh_wgs@data$health_srv=health_srv
segmento_sh_wgs@data$food=food
segmento_sh_wgs@data$cult_educ=cult_educ
segmento_sh_wgs@data$fuel=fuel
segmento_sh_wgs@data$official=official

Lastly, a series of covariates is computed on the roads data. As the goal is to measure length, the first step is to change the coordinates reference system (CRS) of the roads vector into a projected coordinates system with units expressed in meters. The CRS of the original segmento data is used.

roads_lines_pr=sp::spTransform(roads_lines_sh,
                               sp::proj4string(segmento_sh))

The OSM’ roads’ are then regrouped into nine categories according to their highway class.

## Warning in kableExtra::column_spec(., 3, width = "7cm"): Please specify
## format in kable. kableExtra can customize either HTML or LaTeX outputs. See
## https://haozhu233.github.io/kableExtra/ for details.
Category Feature’s name OSM highway class
Fast track fast_track motorway_link, motorway, trunk_link, trunk, primary_link, primary
Secondary secondary secondary, secondary_link
Tertiary tertiary tertiary_link, tertiary
Residential residential residential, corridor
Rural track rural_track track
Pedestrian and bike line urban_no_motor living_street, cycleway, steps, path, footway, pedestrian, service
Bus lane bus bus_stop, bus_guideway
Luxury luxury raceway, bridleway
Unclassified unclassified no construction, unclassified, road

The length of roads’ segments per segmento and according to the roads’ categories is computed thanks to the function lineLength applied via over. The results is expressed in meters, as the unit of the CRS is meters.

# length
length_all_seg=sp::over(segmento_sh,
                        as(roads_lines_pr,
                           "SpatialLines"),
                        fn=lineLength,
                        na.rm=T)
length_all_seg[is.na(length_all_seg)]=0

for(i in 1:length(list_road_types)){
  
  roads_type_sub=subset(roads_lines_pr,
                        highway %in%list_road_types[[i]])
  
  length_seg=sp::over(segmento_sh,
                      as(roads_type_sub,
                         "SpatialLines"),
                      fn=lineLength,
                      na.rm=T)
  length_seg[is.na(length_seg)]=0
  assign(list_road_names[[i]],
         length_seg)
}

# add road length data to the shapefile
segmento_sh_wgs@data$all_roads=length_all_seg
segmento_sh_wgs@data$fast_track=fast_track
segmento_sh_wgs@data$secondary=secondary
segmento_sh_wgs@data$tertiary=tertiary
segmento_sh_wgs@data$residential=residential
segmento_sh_wgs@data$rural_track=rural_track
segmento_sh_wgs@data$urban_no_motor=urban_no_motor
segmento_sh_wgs@data$bus=bus
segmento_sh_wgs@data$luxury=luxury
segmento_sh_wgs@data$unclassified=unclassified

9.7.3.4 Normalizing the road data

The road network is normalized by the area of segmento: we obtain the kilometers of roads per square kilometers. The density of roads is computed as the length of roads divided by the area of segmento. The latter is already computed in the shapefile and stored in the feature Shape_Area.

# density: length km/area
road_density_all=length_all_seg/segmento_sh@data$Shape_Area
for(i in 1:length(list_road_types)){
  
  density_type_i=get(list_road_names[[i]])/segmento_sh_wgs@data$Shape_Area
  assign(paste0("dens_",
                list_road_names[[i]]),
         density_type_i)
}
# add road density data to the shapefile
segmento_sh_wgs@data$dens_all_roads=road_density_all
segmento_sh_wgs@data$dens_fast_track=dens_fast_track
segmento_sh_wgs@data$dens_secondary=dens_secondary
segmento_sh_wgs@data$dens_tertiary=dens_tertiary
segmento_sh_wgs@data$dens_residential=dens_residential
segmento_sh_wgs@data$dens_rural_track=dens_rural_track
segmento_sh_wgs@data$dens_urban_no_motor=dens_urban_no_motor
segmento_sh_wgs@data$dens_bus=dens_bus
segmento_sh_wgs@data$dens_luxury=dens_luxury
segmento_sh_wgs@data$dens_unclassified=dens_unclassified

9.7.4 Population density

We compute the average population density by dividing the segmento total population by its areas.

segmento_sh_wgs@data$pop_dens=segmento_sh_wgs@data$gpw_es_TOT/segmento_sh_wgs@data$Shape_Area/10000

9.8 Merging the covariates with the EHPM data

The data from SpatialPolygonDataframe are saved to a standard data frame.

The EHPM survey data tables/ehpm-2017.csv are loaded, one selects only the development indicators and the the household id idboleta. The idboleta is used to match the EHPM with the segmento ID in tables/Identificador de segmento.xlsx. Lastly, the SEG_ID is used to match the survey data with the predictor.

# save the data stored in segmento_sh_wgs@data in a standard data frame
#select a subset of the covariates
predictors=segmento_sh_wgs@data

# survey data
ehpm17=read.csv(paste0(dir_data,
                       "tables/ehpm-2017.csv"))

# read the files with the segmento ID and the idboleta
segID=readxl::read_xlsx(paste0(dir_data,
                               "tables/Identificador de segmento.xlsx"),
                        sheet="2017")

# select only the dev indicators
ehpm17_dev_indic=ehpm17%>%
  dplyr::select(municauto,
                idboleta,
                pobreza, # poverty
                ingpe, # household income
                r202a, #Sabe leer y escribir
                r106 # age
  ) %>%
  dplyr::rename("lit"="r202a", # rename these 2 variables
                "age"="r106")%>%
  dplyr::mutate(over_15=ifelse(age>=15,1,0),
                literate=ifelse(lit==1,1,0),
                literate_over_15=over_15*literate) # Literate over 15 years old 

# Match the survey data with the segID 
ehpm17_seg=ehpm17_dev_indic%>%
  left_join(segID,
            by="idboleta")%>%
  rename("SEG_ID"="seg_id")

We filter out below all survey participants less than 16 years old. We compute hence development indicators for the population aged 16 years old and above.

# Aggregated the survey data at the segmento level
ehpm17_seg_agg=ehpm17_seg%>%
  filter(over_15==1)%>%
  group_by(SEG_ID)%>%
  summarise(municauto=unique(municauto),
            n_obs=n(),
            pobreza_extrema=sum(pobreza==1,na.rm = T)/n(),
            pobreza_mod=sum(pobreza%in%c(1,2),na.rm = T)/n(),
            literacy_rate=sum(literate_over_15,na.rm = T)/n(),
            ingpe=median(ingpe,na.rm = T))


# Merge the aggregated survey data with the predictor data 
ehpm17_predictors=ehpm17_seg_agg%>%
  right_join(predictors%>%
               mutate(SEG_ID=as.character(SEG_ID)),
             by="SEG_ID")
# write out the file
write.csv(ehpm17_predictors,
          paste0(dir_data,
                 "out/ehpm17_predictors2.csv"),
          row.names = F)
print(paste(ncol(ehpm17_predictors)-21, "predictors"))
## [1] "78 predictors"

The csv file out/ehpm17_predictors.csv contains the 79 predictors derived above plus the following development indicators:

## Warning in kableExtra::column_spec(., 3, width = "7cm"): Please specify
## format in kable. kableExtra can customize either HTML or LaTeX outputs. See
## https://haozhu233.github.io/kableExtra/ for details.
Variable name Definition
pobreza_extrema Proportion of individuals 15 years in extreme poverty: income below
pobreza_mod Proportion of individuals 15 years in moderate poverty: income below $107.26 in urban areas or $66.90 in rural areas
literacy_rate Proportion of individuals 15 years or older who can read and write
ingpe Median income of individuals 15 years or older (USD)

9.9 Summary

The process of going from raw RS layers to a SpatialPolygonDataframe ready for analysis can be summarized as follows:

  • load the RS layers
  • create summaries for RS layers which come with a time dimension (e.g. average precipitation over the last thirthy years)
  • perform additional raster calculation as required (e.g. compute the Soil degradation index based on soil degradation sub-indicator)
  • compute distance metrics to public services, businesses and some natural features
  • align the Coordinates reference systems across layers (geographic or projected)
  • aggregate the RS layers to the map of interest (e.g. the segmento map) with a chosen set of spatial statistics
  • match the RS aggregates with the outcome data (poverty, income, literay)
  • write the data frame to a .csv file for later use

The 22 main data sources listed on the next table are summarized in the SpatialPolygonDataframe segmento_sh_wgs into 79 segmento level spatial statistics listed on table ??. These are the candidate covariates to predict average income, poverty and literacy measured in the in the 2017 EHPM survey.

## Warning in kableExtra::column_spec(., 3, width = "7cm"): Please specify
## format in kable. kableExtra can customize either HTML or LaTeX outputs. See
## https://haozhu233.github.io/kableExtra/ for details.
Category Features
Precipitation chirps_ev_2017, chirps_ev_med
Soil degradation soil_carb, soil_luc, soil_prod_state, soil_prod_prf, soil_prod_trj, soil_deg_SDG_index
Vegetation Greenness (NDVI) ndvi_17_median, ndvi_17_sd, ndvi_17_sum, ndvi_17_perc_median, ndvi_17_perc_sd, ndvi_17_perc_sum
Altitude (DEM) dem
Population density gpw_es, pop_dens, gpw_es_TOT
Temperature temp_median
Livelihood Zones LZCODE,LZNAMEEN
Education dist2schools_priv, dist2schools_pub,dist2schools
Health services dist2health, dist2hosp
Buildings people_building
Shops See distances
Other points of interest See distances
Roads dens_all_roads, dens_fast_track, dens_secondary, dens_tertiary, dens_residential, dens_rural_track, dens_urban_no_motor, dens_bus, dens_luxury, dens_unclassified
Slope slope
Distances dist2road, dist2roadInter, dist2water
Lights at night lights_mean, lights_sd
Land Classes lc_urban,lc_crop,lc_tree
Distance dist2schools_priv, dist2schools_pub, dist2schools, dist2health, dist2hosp, dist2pubamen, dist2allpubserv, dist2biz, dist2fin, dist2coast_r, dist2water_r, dist2crop_r, dist2tree_r, dist2urban_r

In the next next section, we explore these data and their link with average income, poverty and literacy.

References

Fick, Stephen E, and Robert J Hijmans. 2017. “WorldClim 2: New 1-Km Spatial Resolution Climate Surfaces for Global Land Areas.” International Journal of Climatology 37 (12). Wiley Online Library: 4302–15.

Henderson, J Vernon, Adam Storeygard, and David N Weil. 2012. “Measuring Economic Growth from Outer Space.” American Economic Review 102 (2): 994–1028.


  1. http://desktop.arcgis.com/en/arcmap/10.3/guide-books/map-projections/about-geographic-coordinate-systems.htm

  2. http://desktop.arcgis.com/en/arcmap/10.3/guide-books/map-projections/about-projected-coordinate-systems.htm

  3. Plotting these data allowed us in the first instance to identify the error in the naming of the coordinates: latitudes are actually longitudes and vice-versa. We therefore needed to use the rename command.

  4. “A RasterBrick is a multi-layer raster object. They are typically created from a multi-layer (band) file; but they can also exist entirely in memory. They are similar to a RasterStack (that can be created with stack), but processing time should be shorter when using a RasterBrick. Yet they are less flexible as they can only point to a single file” (R. J. Hijmans 2019b)

  5. TRENDS.EARTH (formerly the Land Degradation Monitoring Toolbox) is a tool utilized to monitor land change. It is a QGIS plugin that supports monitoring of land change, including changes in productivity, land cover, and soil organic carbon. The tool supports land degradation monitoring for reporting to the Global Environment Facility (GEF) and United Nations Convention to Combat Desertification (UNCCD). Furthermore, it is also used to track the progress towards the achievement of Sustainable Development Goal (SDG) target 15.3 i.e. Land Degradation Neutrality (LDN).

  6. the function raster::distanceFromPoints is not very efficient and it takes a few second to run. In countries with larger territories, and hence larger data volume, running GRASS from R via ‘rgrass7’ is the way to go

  7. Other spatial summaries could be investigated (e.g. average, median, min, max, standard deviation, sum etc.), however we focus on the mean for the sake of expediency

  8. The process took circa 2. minute and 30 seconds with velox while it did not complete with raster::extract after 2 hours on 16GB RAM PC operating under Windows 10 an Intel(R) Core(TM) i7-7700 HQ CPU @2.8 GHz.“Velox is fast because all raster computations are performed in C++ (using the excellent Rcpp API), and all data is held in memory”(Hunziker 2017)

  9. we focus on class 50 and 60 as those are the most present